Information retrieval and terminology extraction in online resources for patients with diabetes.
Abstract
Terminology use, as a means of information retrieval or document indexing, plays an important role in health literacy. Specific types of users, such as patients with diabetes, need access to various online resources (in a foreign and/or native language) when searching for information on self-education in basic diabetic knowledge, on self-care activities concerning the importance of dietetic food, medications and physical exercise, and on self-management of insulin pumps. Automatic extraction of corpus-based terminology from online texts, manuals or professional papers can help in building terminology lists or lists of "browsing phrases" useful in information retrieval or document indexing. Specific terminology lists represent an intermediate step between free-text search and controlled vocabulary, and between users' demands and existing online resources in native and foreign languages. The research, which aims to detect the role of terminology in online resources, is conducted on English and Croatian manuals and Croatian online texts, and is divided into three interrelated parts: i) comparison of professional and popular terminology use; ii) evaluation of automatic, statistically based terminology extraction on English and Croatian texts; iii) comparison and evaluation of terminology extracted from an English manual using statistical and hybrid approaches. Extracted terminology candidates are evaluated by comparison with three types of reference lists: a list created by a medical professional, a list of highly professional vocabulary contained in MeSH, and a list created by non-medical persons, made as the intersection of 15 lists. The results report on the use of popular and professional terminology in online diabetes resources, on the evaluation of automatically extracted terminology candidates in English and Croatian texts, and on the comparison of statistical and hybrid extraction methods in English text.
Automatic and semi-automatic terminology extraction methods are evaluated using recall, precision and F-measure.
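The evaluation described in the abstract can be illustrated with a minimal sketch. It assumes exact, case-insensitive matching between extracted candidates and a reference list, and it uses the balanced F-measure (F1); the abstract does not state the matching rule or the F-measure weighting, so both are assumptions. The function names `evaluate_terms` and `intersect_lists` and all sample terms are hypothetical and are not taken from the paper.

```python
def normalize(terms):
    """Lower-case and strip terms so that matching is case-insensitive (assumption)."""
    return {t.strip().lower() for t in terms}


def intersect_lists(lists):
    """Build a reference list as the intersection of several annotator lists."""
    sets = [normalize(lst) for lst in lists]
    return set.intersection(*sets) if sets else set()


def evaluate_terms(extracted, reference):
    """Return (precision, recall, f_measure) of extracted candidates vs. a reference list."""
    extracted_set = normalize(extracted)
    reference_set = normalize(reference)
    true_positives = len(extracted_set & reference_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical data: three annotator lists stand in for the 15 lists in the study.
annotator_lists = [
    ["insulin pump", "blood glucose", "hypoglycemia", "diet"],
    ["Insulin pump", "blood glucose", "hypoglycemia"],
    ["insulin pump", "blood glucose", "hypoglycemia", "exercise"],
]
reference = intersect_lists(annotator_lists)
extracted = ["insulin pump", "blood glucose", "physical exercise", "online manual"]

precision, recall, f_measure = evaluate_terms(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

With these sample lists, two of the four extracted candidates appear among the three reference terms, which gives a precision of 0.50, a recall of about 0.67 and an F-measure of about 0.57.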
Similar resources
A Comparative Study of the Thesauri of Islamic Knowledge and Quranic Sciences
This study examines the comparative strengths and weaknesses of the thesaurus and thesaurus Quranic teachings of the Koran. In today's society where the documents are kept electronically, retrieval and dissemination of information for the development of research, much greater importance of saving documents and thesaurus that is the basis for indexing in various sciences, One of the solutions fo...
A Swedish Scientific Medical Corpus for Terminology Management and Linguistic Exploration
This paper describes the development of a new Swedish scientific medical corpus. We provide a detailed description of the characteristics of this new collection as well as results of an application of the corpus to term management tasks, including terminology validation and terminology extraction. Although the corpus is representative of the scientific medical domain it still covers in detail a l...
Bootstrapping Term Extractors for Multiple Languages
Terminology extraction resources are needed for a wide range of human language technology applications, including knowledge management, information extraction, semantic search, cross-language information retrieval and automatic and assisted translation. We report a low cost method for creating terminology extraction resources for 21 non-English EU languages. Using parallel corpora and a project...
TExtractor: a multilingual terminology extraction tool
This demonstration presents a tool (TExtractor) employed for enriching terminology sets in four languages: English, French, German and Spanish. We present the associated linguistic resources and the experimental results obtained in the medical domain. TExtractor has been developed within project LIQUID (IST-2000-25324), which aims at developing a cost-effective solution for the problem of cross...
Intelligent and Online Evaluation of Diabetes using Wireless Sensor Networks and Support Vector Machines Algorithm
Objective: International Diabetes Organization estimates that there are 285 million people worldwide who suffer from diabetes, and this figure is expected to increase to 450 million in the next 20 years. According to statistics issued by the World Health Organization, diabetes is considered among the ten leading causes of death in the world and its prevalence in the population is growing. This paper deals w...
Journal title:
- Collegium antropologicum
Volume 38, Issue 2
Pages: -
Publication date: 2014